Multiple Precision Integer Multiplication on GPUs

Authors

  • Koji Kitano
  • Noriyuki Fujimoto
Abstract

This paper addresses multiple precision integer multiplication on GPUs. We propose a novel data structure, the product digit table, and present a GPU algorithm that performs the multiplication using it. Experimental results on a 3.10 GHz Intel Core i3-2100 CPU and an NVIDIA GeForce GTX480 GPU show that the proposed GPU algorithm runs over 71.4 times and 12.8 times faster than the NTL and GMP libraries, respectively, two common libraries for single-threaded multiple precision arithmetic on CPUs. Further experiments also show that the proposed GPU algorithm is faster than the fastest existing GPU algorithm based on FFT multiplication if the bit lengths of the two given multiple precision integers are ...


Similar resources

Design and Implementation of Multiple-precision Integer Library for GPUs

Multiple-precision modular multiplications are the key components in security applications, like public-key cryptography for encrypting and signing digital data. But unfortunately they are computationally expensive for contemporary CPUs. By exploiting the computing power of the many-core GPUs, we implemented a multiple-precision integer library with CUDA. In the previous articles, there are som...


Implementation of Multiple-precision Modular Multiplication on GPU

Multiple-precision modular multiplications are the key components in security applications, like public-key cryptography for encrypting and signing digital data. But unfortunately they are computationally expensive for contemporary CPUs. By exploiting the computing power of the many-core GPUs, we implemented a multiple-precision integer library with CUDA. In this paper, we will investigate the ...


Fast Conjugate Gradients with Multiple GPUs

The limiting factor for efficiency of sparse linear solvers is the memory bandwidth. In this work, we describe a fast Conjugate Gradient solver for unstructured problems, which runs on multiple GPUs installed on a single mainboard. The solver achieves double precision accuracy with single precision GPUs, using a mixed precision iterative refinement algorithm. To achieve high computation speed, ...


Conjugate Gradients on Multiple GPUs

A GPU-accelerated Conjugate Gradient solver is tested on eight different matrices with different structural and numerical characteristics. The first four matrices are obtained by discretizing the 3D Poisson’s equation, which arises in many fields such as computational fluid dynamics, heat transfer and so on. Their relatively low bandwidth and low condition numbers make them ideal targets for G...


Algorithm Exploration for Long Integer Modular Arithmetic on a SPARC V8 Processor with Cryptography Extensions

In recent years, public-key cryptography has emerged to become an important workload for embedded processors, driven by a number of factors such as the need for securing wireless communication. The computational requirements of public-key cryptosystems are often beyond the modest capabilities of embedded processors, which motivated the development of architectural enhancements and instruction s...




Publication date: 2014